OVERVIEW: This exploratory project evaluates variables, as measured by the US Government’s American Community Survey (ACS), that may contribute to wildfire-related risks for people and particular demographic groups in and around the Bay Area. It also explores racial disparities within these risk variables. The project was motivated by the recent Dixie Fire, which has been categorized as one of the largest fires, if not the largest, in California history.
This project is divided into four sections: plotting racial disparities in selected risk variables, mapping these disparities, fitting a logistic regression to evaluate whether particular PUMAs (Public Use Microdata Areas, the geographic units used here) are associated with higher wildfire risk, and finally mapping the Bay Area’s fire history and high-risk geographies.
Defining RISK:
For the purposes of this project, wildfire risk is characterized by the following baseline variables from the ACS: household income, the presence or lack of adequate fire insurance, the presence or lack of health insurance, and vehicle availability for responding households.
I chose household income because available finances may help households escape, prevent, or mitigate wildfire destruction or burdens.
I chose the presence or lack of health insurance because lacking coverage puts an individual at higher risk of death, injury, or financial burden from wildfire-related complications. This indicator may also reflect a household’s financial situation.
I chose the presence or lack of fire insurance because inadequate fire insurance (defined here as coverage below $1,000 annually) places individuals at direct risk of wildfire burdens relative to their better-covered neighbors and peers.
I chose vehicle availability because households with vehicles may be able to evade or escape wildfires, or help fight or mitigate fires, more effectively than their less-mobile counterparts.
EXPLORATION 1: PLOTTING DISPARITIES
GOAL: To establish a baseline understanding of how each of the factors that may influence wildfire risk in the Bay Area breaks down by race.
Regarding household income, there is a clear trend: as income level increases across the Bay Area, White and Asian populations are increasingly represented, eventually becoming overrepresented at incomes over $200,000 annually. In contrast, African Americans, American Indians, and the Other Race category (including non-white Hispanics) all become more underrepresented as income increases.
Regarding health coverage, this simple race-composition plot shows that White, Asian, and Black respondents in the Bay Area are underrepresented among those with no health coverage, which is good; however, there is a glaring overrepresentation of the Other Race category.
For transportation methods, White Bay Area residents are considerably overrepresented in the “worked from home” and “Taxicab, motorcycle, bicycle, or other means” categories, whereas African Americans are overrepresented in the “public transportation” category. The “Other Races” category is overrepresented in the “carpooling” category.
For the poverty-level race plot, African Americans, Native Americans, and “Other Races” are overrepresented below the poverty line, whereas White and Asian Bay Area residents are overrepresented at or above the poverty line.
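One way to make “overrepresented” and “underrepresented” concrete is a representation ratio: a group’s share within a category divided by its share of the overall population. The Python sketch below illustrates that idea only; the function name and all counts are hypothetical and are not taken from the ACS or from this project’s plots.

```python
def representation_ratios(category_counts, population_counts):
    """Return each group's share within a category divided by its share of
    the overall population. Values above 1 suggest overrepresentation,
    values below 1 suggest underrepresentation."""
    category_total = sum(category_counts.values())
    population_total = sum(population_counts.values())
    ratios = {}
    for group, count in category_counts.items():
        category_share = count / category_total
        population_share = population_counts[group] / population_total
        ratios[group] = category_share / population_share
    return ratios


# Hypothetical counts, for illustration only (not ACS figures).
population = {"White": 500, "Asian": 300, "Black": 100, "Other": 100}
no_coverage = {"White": 30, "Asian": 20, "Black": 5, "Other": 45}

ratios = representation_ratios(no_coverage, population)
for group, ratio in ratios.items():
    if ratio > 1:
        label = "overrepresented"
    elif ratio < 1:
        label = "underrepresented"
    else:
        label = "proportionally represented"
    print(f"{group}: ratio {ratio:.2f} ({label})")
```

A ratio is easier to compare across categories than raw shares, since it accounts for groups having very different population sizes.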
EXPLORATION 2: MAPPING DISPARITIES
GOAL: To map the distribution of each of my ‘wildfire risk’ variables. To reiterate, I will map the PUMA-level distributions of the following risk variables: income, lack of health coverage, lack of an accessible vehicle, and the percentage of households with fire insurance.
There appear to be possibly significant pockets of low-income individuals around the Bay, in Berkeley, San Jose, San Francisco, Burlingame, and Oakland.
There appear to be possibly significant pockets of communities lacking health coverage in Santa Rosa, Richmond, Oakland, South San Francisco, and across San Jose down to Morgan Hill and Gilroy.
As expected, the concentration of households without access to a vehicle reaches upwards of 50% in the city of San Francisco; however, across the Bay Area, particularly in the East Bay, there are stable populations without access to vehicles (in the East Bay, roughly 7%-20% of households).
For the fire insurance map, I decided to flip the measure and show the percentage of households that are covered, rather than the percentage that lack coverage. I was surprised to see that much of the Bay Area does not carry coverage above $1,000 per year. The most insured Bay Area PUMAs, from roughly Berkeley to Brentwood, still had only about 65% of households adequately insured, leaving a significant portion (35%) of households underinsured by this project’s standards. This was the trend across the Bay Area.
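The per-PUMA percentages described above can be thought of as weighted shares. The Python sketch below shows one way such a share might be computed, assuming each household record carries a survey weight; the record layout, field names, PUMA labels, and values are all hypothetical and do not reproduce the code behind these maps.

```python
from collections import defaultdict


def weighted_share_by_puma(records, flag_key):
    """Return, for each PUMA, the weighted fraction of households
    whose flag_key field is True."""
    flagged = defaultdict(float)
    total = defaultdict(float)
    for rec in records:
        total[rec["puma"]] += rec["weight"]
        if rec[flag_key]:
            flagged[rec["puma"]] += rec["weight"]
    return {puma: flagged[puma] / total[puma] for puma in total}


# Hypothetical household records, for illustration only.
households = [
    {"puma": "A", "weight": 10, "no_vehicle": True},
    {"puma": "A", "weight": 30, "no_vehicle": False},
    {"puma": "B", "weight": 20, "no_vehicle": True},
    {"puma": "B", "weight": 20, "no_vehicle": True},
    {"puma": "B", "weight": 40, "no_vehicle": False},
]

shares = weighted_share_by_puma(households, "no_vehicle")
for puma, share in sorted(shares.items()):
    print(f"PUMA {puma}: {share:.0%} of households without a vehicle")
```

The same function could be reused for any of the four indicators by swapping the flag field, which keeps the maps comparable.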
EXPLORATION 3: LOGISTIC REGRESSION
GOAL: Measure whether the predictor variables (particularly PUMA and race) significantly affect firerisk. In this analysis, firerisk is defined as being both low income and uninsured.
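As a rough illustration of that definition, the outcome can be sketched as a binary indicator that is 1 only when both conditions hold. This Python sketch rests on two loud assumptions: the low-income cutoff is a made-up placeholder (the report does not state one), and “uninsured” is taken to mean fire insurance below the $1,000 annual threshold used elsewhere in this project. The function and constant names are hypothetical.

```python
# Hypothetical cutoff: the report does not say what counts as "low income".
LOW_INCOME_CUTOFF = 50_000
# Annual fire insurance threshold used elsewhere in this project.
MIN_FIRE_INSURANCE = 1_000


def fire_risk(household_income, annual_fire_insurance):
    """Return 1 if a household is both low income and underinsured, else 0."""
    low_income = household_income < LOW_INCOME_CUTOFF
    uninsured = annual_fire_insurance < MIN_FIRE_INSURANCE
    return int(low_income and uninsured)


# Hypothetical households, for illustration only.
print(fire_risk(30_000, 0))      # low income and uninsured
print(fire_risk(30_000, 1_500))  # low income but insured
print(fire_risk(120_000, 0))     # uninsured but not low income
```

Because the outcome requires both conditions, a household that is only low income or only uninsured is coded as not at risk.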
The logit model can be interpreted as follows:
The deviance residuals are roughly symmetric around a median of zero, with minimum and maximum values of similar magnitude (-2.348 and 2.849), which is very loosely consistent with a reasonable fit.
race2 (Black or African American alone) is statistically significant, increasing the log odds of fire risk by 0.58 relative to the reference race category.
race3 (Native American alone) is statistically significant, decreasing the log odds of fire risk by 2.13.
race7 (Native Hawaiian or Pacific Islander alone) is statistically significant, decreasing the log odds of fire risk by 1.19.
race8 (Some Other Race alone) is statistically significant, decreasing the log odds of fire risk by 0.59.
The coefficients for households with zero through three vehicles (vehicle0 to vehicle3) are all statistically significant and grow more negative as the number of available vehicles increases (from -2.47 to -5.04), indicating decreasing fire risk with each additional vehicle. The coefficients for four or more vehicles are not significant.
Not having health coverage was not statistically significant (its estimate also has a very large standard error, 186.35). I imagine this is because the aftereffects of fire are not captured in this analysis.
Finally, living in particular PUMAs is, per the presented data, a statistically significant predictor of the log odds of fire risk in the Bay Area. Where one lives is significantly associated with fire risk, even after adjusting for race, health coverage, and vehicle availability. Relative to the reference PUMA, living in some PUMAs, such as PUMA00107, may increase one’s log odds of fire risk by 2.90, whereas living in others, such as PUMA08103, may decrease the log odds by 1.25.
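Log odds can be hard to read directly, so it may help to exponentiate them into odds ratios, where values above 1 mean higher odds of fire risk than the reference category and values below 1 mean lower odds. The Python sketch below does this for a handful of coefficients copied from the model summary in this report; the dictionary and function names are illustrative only.

```python
import math

# Log-odds coefficients copied from the model summary in this report.
coefficients = {
    "race2": 0.57623,
    "race3": -2.13073,
    "race7": -1.18914,
    "race8": -0.59276,
    "PUMA00107": 2.89582,
    "PUMA08103": -1.25373,
}


def odds_ratio(log_odds):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(log_odds)


odds_ratios = {name: odds_ratio(beta) for name, beta in coefficients.items()}
for name, ratio in odds_ratios.items():
    direction = "higher" if ratio > 1 else "lower"
    print(f"{name}: odds ratio {ratio:.2f} ({direction} odds than the reference)")
```

Each odds ratio is relative to that variable’s omitted reference level, so ratios for race terms and PUMA terms are not directly comparable to one another.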
##
## Call:
## glm(formula = firerisk ~ race + health_coverage + vehicle + PUMA,
## family = quasibinomial(), data = bay_pums_clean)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.348 0.000 0.000 0.000 2.849
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -32.94357 186.34788 -0.177 0.859678
## race2 0.57623 0.06607 8.722 < 2e-16 ***
## race3 -2.13073 0.28620 -7.445 9.95e-14 ***
## race5 14.68805 131.28550 0.112 0.910920
## race6 -0.10055 0.06115 -1.644 0.100154
## race7 -1.18914 0.14656 -8.114 5.09e-16 ***
## race8 -0.59276 0.06299 -9.410 < 2e-16 ***
## race9 0.09146 0.08935 1.024 0.306018
## health_coverage2 33.46150 186.34785 0.180 0.857495
## vehicle0 -2.47400 0.06615 -37.398 < 2e-16 ***
## vehicle1 -3.48116 0.05459 -63.768 < 2e-16 ***
## vehicle2 -4.57305 0.07293 -62.705 < 2e-16 ***
## vehicle3 -5.03834 0.10727 -46.967 < 2e-16 ***
## vehicle4 -22.81338 595.81326 -0.038 0.969457
## vehicle5 -22.96053 1068.08996 -0.021 0.982849
## vehicle6 -23.01685 1425.50572 -0.016 0.987118
## PUMA00102 -1.52410 0.15472 -9.851 < 2e-16 ***
## PUMA00103 0.05233 0.19975 0.262 0.793329
## PUMA00104 -19.23404 1233.59224 -0.016 0.987560
## PUMA00105 -0.77742 0.22077 -3.521 0.000430 ***
## PUMA00106 -0.32644 0.23824 -1.370 0.170632
## PUMA00107 2.89582 0.17742 16.321 < 2e-16 ***
## PUMA00108 -17.05463 1008.96595 -0.017 0.986514
## PUMA00109 2.58345 0.22370 11.549 < 2e-16 ***
## PUMA00110 3.08446 0.16728 18.439 < 2e-16 ***
## PUMA01301 -0.63766 0.22495 -2.835 0.004590 **
## PUMA01302 2.27738 0.15829 14.387 < 2e-16 ***
## PUMA01303 0.28707 0.21655 1.326 0.184954
## PUMA01304 0.22011 0.21457 1.026 0.304995
## PUMA01305 -0.46171 0.23233 -1.987 0.046896 *
## PUMA01306 0.59046 0.23197 2.545 0.010921 *
## PUMA01307 -0.01104 0.22592 -0.049 0.961025
## PUMA01308 1.30673 0.20183 6.474 9.66e-11 ***
## PUMA01309 0.58901 0.19791 2.976 0.002921 **
## PUMA04101 0.94770 0.19225 4.929 8.29e-07 ***
## PUMA04102 2.53887 0.15277 16.619 < 2e-16 ***
## PUMA05500 0.79846 0.14002 5.703 1.19e-08 ***
## PUMA07501 -0.07427 0.15965 -0.465 0.641805
## PUMA07502 -1.15759 0.16566 -6.988 2.85e-12 ***
## PUMA07503 -0.69198 0.15736 -4.397 1.10e-05 ***
## PUMA07504 -18.64272 1052.91883 -0.018 0.985874
## PUMA07505 1.24625 0.15626 7.976 1.57e-15 ***
## PUMA07506 0.39381 0.15838 2.487 0.012904 *
## PUMA07507 0.66221 0.18044 3.670 0.000243 ***
## PUMA08101 0.01461 0.20920 0.070 0.944333
## PUMA08102 -0.42326 0.21390 -1.979 0.047845 *
## PUMA08103 -1.25373 0.21104 -5.941 2.87e-09 ***
## PUMA08104 -20.24913 953.46158 -0.021 0.983056
## PUMA08105 1.18292 0.15631 7.568 3.90e-14 ***
## PUMA08106 -0.65686 0.21476 -3.059 0.002226 **
## PUMA08501 -0.32129 0.21139 -1.520 0.128548
## PUMA08502 1.42476 0.15247 9.345 < 2e-16 ***
## PUMA08503 1.28536 0.16527 7.777 7.64e-15 ***
## PUMA08504 2.08172 0.14193 14.668 < 2e-16 ***
## PUMA08505 0.90259 0.21972 4.108 4.00e-05 ***
## PUMA08506 -18.06167 1224.78531 -0.015 0.988234
## PUMA08507 1.52808 0.20819 7.340 2.19e-13 ***
## PUMA08508 -29.11330 964.93882 -0.030 0.975931
## PUMA08509 0.46614 0.12996 3.587 0.000335 ***
## PUMA08510 -0.31363 0.21105 -1.486 0.137283
## PUMA08511 -0.77351 0.22030 -3.511 0.000447 ***
## PUMA08512 -19.82964 1028.76875 -0.019 0.984622
## PUMA08513 -19.59399 1227.70393 -0.016 0.987267
## PUMA08514 1.29253 0.17262 7.488 7.18e-14 ***
## PUMA09501 1.46511 0.15592 9.397 < 2e-16 ***
## PUMA09502 2.04991 0.15704 13.054 < 2e-16 ***
## PUMA09503 -0.02857 0.19398 -0.147 0.882921
## PUMA09701 0.63615 0.13892 4.579 4.68e-06 ***
## PUMA09702 0.05236 0.17283 0.303 0.761921
## PUMA09703 1.17372 0.12978 9.044 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for quasibinomial family taken to be 0.02340203)
##
## Null deviance: 2424.82 on 31682 degrees of freedom
## Residual deviance: 443.61 on 31613 degrees of freedom
## AIC: NA
##
## Number of Fisher Scoring iterations: 24
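Two quick checks can be read off the summary above. The Python sketch below computes the proportion of deviance explained from the reported null and residual deviances, and flags terms whose standard errors are very large, which can be a sign of sparse categories or near-separation and means those estimates should be read with caution. The standard-error cutoff of 100 is an arbitrary, hypothetical choice, and only a few terms from the summary are included.

```python
# Deviances copied from the model summary above.
null_deviance = 2424.82
residual_deviance = 443.61


def deviance_explained(null_dev, resid_dev):
    """Proportion of the null deviance accounted for by the fitted model."""
    return 1 - resid_dev / null_dev


# (estimate, standard error) pairs for a few terms from the summary above.
terms = {
    "race2": (0.57623, 0.06607),
    "health_coverage2": (33.46150, 186.34785),
    "vehicle4": (-22.81338, 595.81326),
    "PUMA00107": (2.89582, 0.17742),
    "PUMA08508": (-29.11330, 964.93882),
}


def unstable_terms(term_table, se_cutoff=100.0):
    """Return names of terms whose standard error exceeds se_cutoff.
    The cutoff is a hypothetical rule of thumb, not a formal test."""
    return sorted(name for name, (_, se) in term_table.items() if se > se_cutoff)


explained = deviance_explained(null_deviance, residual_deviance)
flagged = unstable_terms(terms)
print(f"Deviance explained: {explained:.3f}")
print("Terms with very large standard errors:", ", ".join(flagged))
```

Terms flagged this way are not evidence of no effect; rather, the data may simply contain too few cases in those categories to estimate them reliably.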
EXPLORATION 4: THE BAY’s FIRE HISTORY
GOAL: Visualize and plot the geographic history of wildfires in the Bay Area from 1992 to 2015. This analysis highlights the Wildfire Hazard Potential (WHP) data for Bay Area counties and overlays these high-risk geographies with the actual fires that may overlap them.
The foundation code for the following plots comes from Patrick Baylis’s research on wildfire mapping. He created an analysis of Butte County’s fire history, which I based my code on to do a similar analysis of the Bay Area’s fire history and risk.
More information can be found here: https://www.patrickbaylis.com/blog/2021-01-31-fire-maps/
This map shows the Wildfire Hazard Potential locations in the Bay Area, as noted by CalFIRE, with increasing blue coloration indicating higher WHP.
## Reading layer `firep20_1' from data source
## `/Users/Hunter/Documents/GitHub/hunterdickey.github.io/fire-maps/fire20_1.gdb'
## using driver `OpenFileGDB'
## Simple feature collection with 21318 features and 17 fields
## Geometry type: GEOMETRY
## Dimension: XY
## Bounding box: xmin: -373237.5 ymin: -604727.6 xmax: 539438.2 ymax: 518283.7
## Projected CRS: NAD83 / California Albers
This second map of the Bay Area (placed above the fire insurance map for comparison) shows the Wildfire Hazard Potential zones (blue) overlaid with the Bay’s contemporary fire history. Compared with the fire insurance indicator, this map highlights a potential inadequacy in the Bay Area’s current level of wildfire coverage, particularly around Sonoma, San Mateo, Santa Clara, and Alameda counties.
CONCLUDING THOUGHTS
This project helped me see the disparities between racial groups in certain ACS variables and aspects of life in the Bay Area, with certain groups being more represented than others across variables and across subcategories within variables, such as income.
Moreover, mapping income, health coverage, transportation method, and fire coverage helped me see that there are significant overlaps and deficiencies in certain pockets of the Bay Area. In some cases, like that of fire coverage, these differences overlap with real, calculable risks, particularly from wildfires, as seen in my Wildfire Hazard Potential and wildfire occurrence map exercise.
Finally, my logistic regression analysis backed up some of the rough overlap I observed in my mapping exercise. My predictor variables (race, vehicle accessibility, and PUMA location) demonstrated statistically significant effects on fire risk, either increasing or decreasing the log odds of fire risk in the Bay Area.